Virginia public schools enroll over 1.2 million students across roughly 2,000 schools, and Virginia has the 6th-highest rate of postsecondary attainment in the country. The school system needs to prepare its students properly for the future, especially those from historically marginalized demographic groups, whose postsecondary attainment rates have risen significantly in recent years. The goal of this visualization is to compare fail rates across demographic groups as a way of looking at preparedness for the future and for potential further education.
To compare educational preparedness, I used fail rates on the Standards of Learning (SOL) tests, Virginia's state-mandated assessments, as collected by the Virginia Department of Education. I focused on county-level averages rather than individual-school or state-level figures, both to minimize the influence of outliers and because of problems downloading data from the website. The data include subsets for various demographic groups, such as gender, race, socioeconomic status, disability status, and home status.
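As a rough illustration of working with division-level fail rates, the sketch below collapses several rows per division into one average. It is not the pipeline used for the dashboard: the toy values, the `"<"` placeholder for a non-numeric entry, and the `mean_fail` name are all invented for illustration. The `Division`, `Subgroup`, and `Fail` column names follow the assessment file read in later in this document.

```r
library(dplyr)

# Toy data standing in for the assessment file (all values are invented)
toy <- data.frame(
  Division = c("A County", "A County", "B County", "B County"),
  Subgroup = c("All Students", "All Students", "All Students", "All Students"),
  Fail     = c("10", "20", "30", "<"),  # "<" is an assumed non-numeric placeholder
  stringsAsFactors = FALSE
)

division_avg <- toy %>%
  mutate(Fail = suppressWarnings(as.numeric(Fail))) %>%  # non-numeric entries become NA
  group_by(Division, Subgroup) %>%
  summarise(mean_fail = mean(Fail, na.rm = TRUE), .groups = "drop")  # NA rows are skipped
```

Dropping `NA` values with `na.rm = TRUE` keeps a single unreadable entry from wiping out a division's average, at the cost of basing that average on fewer rows.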
I built the final visualization as a dashboard so that each graph can be viewed separately instead of having seven on the same page. The first tab shows the averages for all counties on a map so the viewer can easily see the increase in scores over time. This is paired with the arrow graph, which shows the improvement for all demographic groups. The remaining tabs let the user focus on a particular comparison, with a histogram providing an easy way to view the distribution of county score averages. This is potentially a draft for an interactive website for exploring data about Virginia public schools.
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
pacman::p_load(here, tidyverse, plotly, dplyr, sf, ggthemes, tidygeocoder, magick, stringr, flexdashboard) # Necessary packages
us_counties <- read_sf(here( "data", "data", "cb_2021_us_county_500k","cb_2021_us_county_500k.shp")) #County SF
va_counties <- us_counties %>%
  filter(STATE_NAME == "Virginia") # Subset to Virginia counties only
school <- read_csv(here("Datasets", "Assessments.csv")) #Read in assessment data from VDOE
# These four divisions each cover more than one locality in the shapefile.
# Copy their rows and relabel the copies so the second locality also gets a match.
school_2 <- school %>%
  filter(Division %in% c(
    "Williamsburg-James City County Public Schools",
    "Greensville County Public Schools",
    "Alleghany Highlands Public Schools",
    "Fairfax County Public Schools"
  )) %>%
  mutate(Division = case_when(
    Division == "Williamsburg-James City County Public Schools" ~ "James City County",
    Division == "Alleghany Highlands Public Schools" ~ "Covington city",
    Division == "Fairfax County Public Schools" ~ "Fairfax city",
    Division == "Greensville County Public Schools" ~ "Emporia city",
    TRUE ~ Division # Keep all other names unchanged
  ))
school_3 <- school %>%
  bind_rows(school_2) # Append the relabeled copies to the original assessment data
cl_sch <- school_3 %>%
  filter(Subgroup == "All Students" & Subject == "Mathematics") %>% # Keep only all-student averages on math assessments
  mutate(Division = str_remove(Division, "Public Schools$")) %>% # Drop "Public Schools" from the end of division names
  mutate(Division = str_replace_all(Division, regex("city", ignore_case = TRUE), "city")) %>% # Change every "City", in any casing, to "city"
  mutate(Division = trimws(Division, which = "right")) %>% # Remove trailing whitespace
  mutate(Division = case_when(
    Division == "Charles city County" ~ "Charles City County", # Restore capitalization where "City" is part of a county name
    Division == "Williamsburg-James city County" ~ "Williamsburg city",
    Division == "Alleghany Highlands" ~ "Alleghany County",
    Division == "James city County" ~ "James City County",
    TRUE ~ Division # Keep all other names unchanged
  )) %>%
  mutate(Fail = as.numeric(Fail)) %>% # Convert Fail from character to numeric
  rename(NAMELSAD = Division) # Rename Division to NAMELSAD to match the shapefile
va_school <- left_join(va_counties, # dataset for county maps
cl_sch, #fail rate information
by = "NAMELSAD")
p <- ggplot(data = va_school) +
  geom_sf(aes(fill = Fail, text = paste("County Name:", NAMELSAD, "\n Percentage of Students who Failed:", Fail, "%"))) + # text feeds the plotly tooltip
  ggthemes::theme_map() +
  scale_fill_continuous(low = "green", high = "red",
                        name = "Percentage Failed", guide = "none") + # Low fail rates are green, high fail rates are red
  labs(title = "Red Indicates High Fail Rates while Green Represents Lower Fail Rates.") +
  facet_wrap(~Year, nrow = 3) # One map of Virginia per year, arranged in three rows
ggplotly(p, tooltip="text") #add interactive elements
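Because the join above depends on division names matching the shapefile's `NAMELSAD` values exactly, a quick check for unmatched names can catch cases the renaming missed. The following is a minimal sketch on invented toy names, not part of the original pipeline; `shape_names` and `division_names` are hypothetical stand-ins for `va_counties` and `cl_sch`.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the shapefile's locality names
shape_names <- data.frame(
  NAMELSAD = c("Fairfax County", "Fairfax city", "Charles City County"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the cleaned assessment data
division_names <- data.frame(
  Division = c("Fairfax County Public Schools", "Charles City County Public Schools"),
  stringsAsFactors = FALSE
) %>%
  # Strip the suffix, then trim and collapse any leftover whitespace
  mutate(NAMELSAD = str_squish(str_remove(Division, "Public Schools$")))

# Localities in the shapefile with no matching assessment row
unmatched <- anti_join(shape_names, division_names, by = "NAMELSAD")
```

Any rows left in `unmatched` would appear on the map with no fill, so they are the names worth inspecting first.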